Wokingham
LITERA: An LLM Based Approach to Latin-to-English Translation
This paper introduces an LLM-based Latin-to-English translation platform designed to address the challenges of translating Latin texts. We named the model LITERA, which stands for Latin Interpretation and Translations into English for Research Assistance. Through a multi-layered translation process utilizing a fine-tuned version of GPT-4o-mini and GPT-4o, LITERA offers an unprecedented level of accuracy, showcased by greatly improved BLEU scores, particularly in classical Latin, along with improved BLEURT scores. The development of LITERA involved close collaboration with Duke University's Classical Studies Department, which was instrumental in creating a small, high-quality parallel Latin-English dataset. This paper details the architecture, fine-tuning methodology, and prompting strategies used in LITERA, emphasizing its ability to produce literal translations.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom > England > Berkshire > Wokingham (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
Do we actually understand the impact of renewables on electricity prices? A causal inference approach
Cacciarelli, Davide, Pinson, Pierre, Panagiotopoulos, Filip, Dixon, David, Blaxland, Lizzie
The energy transition is profoundly reshaping electricity market dynamics. This makes it essential to understand how renewable energy generation actually impacts electricity prices, alongside all other market drivers. These insights are critical for designing policies and market interventions that ensure affordable, reliable, and sustainable energy systems. However, identifying causal effects from observational data is a major challenge, requiring innovative causal inference approaches that go beyond conventional regression analysis. We build upon the state of the art by developing and applying a local partially linear double machine learning approach. Its application yields the first robust causal evidence on the distinct and non-linear effects of wind and solar power generation on UK wholesale electricity prices, revealing key insights that have eluded previous analyses. We find that, over 2018-2024, wind power generation has a U-shaped effect on prices: at low penetration levels, a 1 GWh increase in energy generation reduces prices by up to 7 GBP/MWh, but this effect nearly vanishes at mid-penetration levels (20-30%) before intensifying again. Solar power places substantial downward pressure on prices at very low penetration levels (up to 9 GBP/MWh per 1 GWh increase in energy generation), though its impact weakens quite rapidly. We also uncover a critical trend where the price-reducing effects of both wind and solar power have become more pronounced over time (from 2018 to 2024), highlighting their growing influence on electricity markets amid rising penetration. Our study provides both novel analysis approaches and actionable insights to guide policymakers in appraising the way renewables impact electricity markets.
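For readers unfamiliar with double machine learning, the following is a minimal sketch of the standard (global) partially linear estimator with cross-fitting, not the local variant the abstract describes. Everything in it is an assumption made for illustration: the simulated data, the function names (`knn_predict`, `dml_plr`, `simulate`), and the use of a k-nearest-neighbour regressor as a stand-in for whatever learners the authors used. The idea is to predict both the outcome and the treatment from the confounder on held-out folds, then regress the outcome residuals on the treatment residuals.

```python
import math
import random

def knn_predict(train_x, train_t, query_x, k=15):
    """Mean target of the k training points closest to query_x (1-D confounder)."""
    nearest = sorted(zip(train_x, train_t), key=lambda p: abs(p[0] - query_x))[:k]
    return sum(t for _, t in nearest) / len(nearest)

def dml_plr(x, d, y, n_folds=5, k=15):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(x) + noise."""
    n = len(x)
    res_d, res_y = [0.0] * n, [0.0] * n
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        tx = [x[i] for i in train]
        td = [d[i] for i in train]
        ty = [y[i] for i in train]
        for i in range(n):
            if i % n_folds == fold:
                # Residualise treatment and outcome with models fit on the other folds.
                res_d[i] = d[i] - knn_predict(tx, td, x[i], k)
                res_y[i] = y[i] - knn_predict(tx, ty, x[i], k)
    # Least-squares slope of outcome residuals on treatment residuals.
    return sum(a * b for a, b in zip(res_d, res_y)) / sum(a * a for a in res_d)

def simulate(n=600, theta=-2.0, seed=0):
    """Hypothetical data: the confounder x drives both treatment d and outcome y."""
    rng = random.Random(seed)
    x = [rng.uniform(-2, 2) for _ in range(n)]
    d = [math.sin(xi) + rng.gauss(0, 1) for xi in x]
    y = [theta * di + xi ** 2 + rng.gauss(0, 1) for xi, di in zip(x, d)]
    return x, d, y

x, d, y = simulate()
theta_hat = dml_plr(x, d, y)
print(round(theta_hat, 2))  # should land near the simulated effect of -2.0
```

A local version would presumably re-estimate the slope within neighbourhoods of a conditioning variable such as the penetration level; that step is not shown here.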
- Oceania > Australia (0.04)
- North America > United States > California (0.04)
- Europe > Netherlands (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.86)
CHILLI: A data context-aware perturbation method for XAI
Anwar, Saif, Griffiths, Nathan, Bhalerao, Abhir, Popham, Thomas
The trustworthiness of Machine Learning (ML) models can be difficult to assess, but is critical in high-risk or ethically sensitive applications. Many models are treated as a 'black-box' where the reasoning or criteria for a final decision is opaque to the user. To address this, some existing Explainable AI (XAI) approaches approximate model behaviour using perturbed data. However, such methods have been criticised for ignoring feature dependencies, with explanations being based on potentially unrealistic data. We propose a novel framework, CHILLI, for incorporating data context into XAI by generating contextually aware perturbations, which are faithful to the training data of the base model being explained. This is shown to improve both the soundness and accuracy of the explanations.
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Europe > United Kingdom > England > West Midlands > Coventry (0.04)
- Europe > United Kingdom > England > Berkshire > Wokingham (0.04)
- Health & Medicine (1.00)
- Banking & Finance (0.68)
- Transportation (0.67)
Testing autonomous vehicles and AI: perspectives and challenges from cybersecurity, transparency, robustness and fairness
Llorca, David Fernández, Hamon, Ronan, Junklewitz, Henrik, Grosse, Kathrin, Kunze, Lars, Seiniger, Patrick, Swaim, Robert, Reed, Nick, Alahi, Alexandre, Gómez, Emilia, Sánchez, Ignacio, Kriston, Akos
Artificial Intelligence (AI) plays a critical role in the advancement of autonomous driving. It is likely the main facilitator of high levels of automation, as there are certain technical issues that only seem to be resolvable through advanced AI systems, particularly those based on machine learning. However, the introduction of AI systems in the realm of driver assistance systems and automated driving systems creates new uncertainties due to specific characteristics of AI that make it a distinct technology from traditional systems developed in the field of motor vehicles. Some of these characteristics include unpredictability, opacity, self and continuous learning and lack of causality [1], among other horizontal features such as autonomy, complexity, overfitting and bias. As an example of the specificity that the introduction of AI systems in vehicles entails, the UNECE's Working Party on Automated/Autonomous and Connected Vehicles (GRVA) has been specifically discussing the impact of AI on vehicle regulations since 2020 [2].
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > Germany (0.14)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- Overview (1.00)
- Research Report > Experimental Study (0.45)
- Transportation > Passenger (1.00)
- Transportation > Ground > Road (1.00)
- Information Technology > Security & Privacy (1.00)
Discovery and Recognition of Formula Concepts using Machine Learning
Scharpf, Philipp, Schubotz, Moritz, Cohl, Howard S., Breitinger, Corinna, Gipp, Bela
Citation-based Information Retrieval (IR) methods for scientific documents have proven effective for IR applications, such as Plagiarism Detection or Literature Recommender Systems in academic disciplines that use many references. In science, technology, engineering, and mathematics, researchers often employ mathematical concepts through formula notation to refer to prior knowledge. Our long-term goal is to generalize citation-based IR methods and apply this generalized method to both classical references and mathematical concepts. In this paper, we suggest how mathematical formulas could be cited and define a Formula Concept Retrieval task with two subtasks: Formula Concept Discovery (FCD) and Formula Concept Recognition (FCR). While FCD aims at the definition and exploration of a 'Formula Concept' that names bundled equivalent representations of a formula, FCR is designed to match a given formula to a prior assigned unique mathematical concept identifier. We present machine learning-based approaches to address the FCD and FCR tasks. We then evaluate these approaches on a standardized test collection (NTCIR arXiv dataset). Our FCD approach yields a precision of 68% for retrieving equivalent representations of frequent formulas and a recall of 72% for extracting the formula name from the surrounding text. FCD and FCR enable the citation of formulas within mathematical documents and facilitate semantic search and question answering as well as document similarity assessments for plagiarism detection or recommender systems.
- Europe > Germany > Lower Saxony > Gottingen (0.14)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
- North America > United States > Texas > Tarrant County > Fort Worth (0.04)
London is set for driverless car roll-out – so what comes next?
The French Riviera is lovely at this time of year. The steering wheel spins to take the car round a bend – but my hands stay in my lap. And since there's no need to keep my eyes on the road, I'm free to enjoy the beachfront view. An oddly pixelated man with a two-dimensional windsurfer under his arm gives me the eye. Sadly, my Riviera is being projected on a large wrap-around screen in a room-sized simulator in Wokingham, UK.
- Europe > United Kingdom > England > Berkshire > Wokingham (0.25)
- North America > United States > Texas > Travis County > Austin (0.05)
- North America > United States > Texas > Collin County > Plano (0.05)
- Transportation > Passenger (1.00)
- Transportation > Ground > Road (1.00)
Probabilistic Graphical Models on Multi-Core CPUs using Java 8
Masegosa, Andres R., Martinez, Ana M., Borchani, Hanen
In this paper, we discuss software design issues related to the development of parallel computational intelligence algorithms on multi-core CPUs, using the new Java 8 functional programming features. In particular, we focus on probabilistic graphical models (PGMs) and present the parallelisation of a collection of algorithms that deal with inference and learning of PGMs from data, namely maximum likelihood estimation, importance sampling, and greedy search for solving combinatorial optimisation problems. Through these concrete examples, we tackle the problem of defining efficient data structures for PGMs and parallel processing of same-size batches of data sets using Java 8 features. We also provide straightforward techniques to code parallel algorithms that seamlessly exploit multi-core processors. The experimental analysis, carried out using our open source AMIDST (Analysis of MassIve Data STreams) Java toolbox, shows the merits of the proposed solutions.
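The batch-wise pattern the abstract mentions can be sketched as a map-reduce over sufficient statistics. The sketch below is in Python rather than Java 8 and does not use the AMIDST API; the function names (`batches`, `sufficient_stats`, `parallel_mle`) and the sample data are hypothetical. It estimates a single categorical distribution by counting states per batch in parallel and then merging the counts. In CPython the thread pool mainly illustrates the structure, since the global interpreter lock limits CPU parallelism for pure-Python work.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def batches(data, size):
    """Split data into consecutive batches of at most `size` items."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def sufficient_stats(batch):
    """Map step: count the occurrences of each state in one batch."""
    return Counter(batch)

def parallel_mle(data, batch_size=4, workers=4):
    """Maximum likelihood estimate of a categorical distribution from batched counts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(sufficient_stats, batches(data, batch_size)))
    # Reduce step: counts are additive, so batches can be merged in any order.
    total = reduce(lambda a, b: a + b, partial_counts, Counter())
    n = sum(total.values())
    return {state: count / n for state, count in total.items()}

samples = ["a", "b", "a", "c", "a", "b", "a", "a", "c", "b"]
estimate = parallel_mle(samples)
print(estimate)
```

Because the counts are additive, the same reduce step would apply unchanged to the conditional count tables of a discrete Bayesian network.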
- North America > United States > California > Alameda County > Berkeley (0.04)
- Oceania > Samoa (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- Information Technology > Software > Programming Languages (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
R&D Analyst: An Interactive Approach to Normative Decision System Model Construction
Regan, Peter J., Holtzman, Samuel
This paper describes the architecture of R&D Analyst, a commercial intelligent decision system for evaluating corporate research and development projects and portfolios. In analyzing projects, R&D Analyst interactively guides a user in constructing an influence diagram model for an individual research project. The system's interactive approach can be clearly explained from a blackboard system perspective. The opportunistic reasoning emphasis of blackboard systems satisfies the flexibility requirements of model construction, thereby suggesting that a similar architecture would be valuable for developing normative decision systems in other domains. Current research is aimed at extending the system architecture to explicitly consider sequential decisions involving limited temporal, financial, and physical resources.
- North America > United States > California > San Mateo County > Menlo Park (0.05)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > United States > California > San Mateo County > San Mateo (0.04)
Identifying Mislabeled Training Data
The goal of this approach is to improve classification accuracies produced by learning algorithms by improving the quality of the training data. Our approach uses a set of learning algorithms to create classifiers that serve as noise filters for the training data. We evaluate single algorithm, majority vote and consensus filters on five datasets that are prone to labeling errors. Our experiments illustrate that filtering significantly improves classification accuracy for noise levels up to 30%. An analytical and empirical evaluation of the precision of our approach shows that consensus filters are conservative at throwing away good data at the expense of retaining bad data and that majority filters are better at detecting bad data at the expense of throwing away good data. This suggests that for situations in which there is a paucity of data, consensus filters are preferable, whereas majority vote filters are preferable for situations with an abundance of data. 1. Introduction. The maximum accuracy achievable depends on the quality of the data and on the appropriateness of the chosen learning algorithm for the data. The work described here focuses on improving the quality of training data by identifying and eliminating mislabeled instances prior to applying the chosen learning algorithm, thereby increasing classification accuracy. Labeling error can occur for several reasons including subjectivity, data-entry error, or inadequacy of the information used to label each object. Subjectivity may arise when observations need to be ranked in some way such as disease severity or when the information used to label an object is different from the information to which the learning algorithm will have access. For example, when labeling pixels in image data, the analyst typically uses visual input rather than the numeric values of the feature vector corresponding to the observation. Domains in which experts disagree are natural places for subjective labeling errors (Smyth, 1996).
A third cause of labeling error arises when the information used to label each observation is inadequate. For example, in the medical domain it may not be possible to perform the tests necessary to guarantee that a diagnosis is 100% accurate. For domains in which labeling errors occur, an automated method of eliminating or correcting mislabeled observations will improve the predictive accuracy of the classifier formed from the training data. In this article we address the problem of identifying training instances that are mislabeled.
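The difference between majority vote and consensus filters can be illustrated with a small sketch. It makes several assumptions not taken from the paper: the three base learners (1-nearest-neighbour, 3-nearest-neighbour, nearest centroid) are stand-ins, the one-dimensional data set is invented, and leave-one-out is used as the cross-validation scheme. Each instance is classified by models trained without it; a learner votes to remove the instance when its prediction disagrees with the given label.

```python
from collections import Counter

def make_knn(k):
    """Return a trainer for a k-nearest-neighbour classifier on 1-D features."""
    def fit(train):
        def predict(x):
            nearest = sorted(train, key=lambda p: (p[0] - x) ** 2)[:k]
            return Counter(label for _, label in nearest).most_common(1)[0][0]
        return predict
    return fit

def fit_centroid(train):
    """Train a nearest-centroid classifier on 1-D features."""
    groups = {}
    for x, label in train:
        groups.setdefault(label, []).append(x)
    means = {label: sum(xs) / len(xs) for label, xs in groups.items()}
    def predict(x):
        return min(means, key=lambda label: (means[label] - x) ** 2)
    return predict

def count_votes(data, learners, n_folds):
    """For each instance, count the learners that misclassify it when it is held out."""
    votes = [0] * len(data)
    for fold in range(n_folds):
        train = [p for i, p in enumerate(data) if i % n_folds != fold]
        models = [fit(train) for fit in learners]
        for i, (x, label) in enumerate(data):
            if i % n_folds == fold:
                votes[i] = sum(model(x) != label for model in models)
    return votes

# Hypothetical toy data as (feature, label); indices 3 and 5 carry flipped labels.
data = [(0.0, 0), (1.0, 0), (2.0, 0), (2.9, 1), (3.0, 0), (3.1, 1),
        (9.0, 1), (10.0, 1), (11.0, 1), (12.0, 1)]
learners = [make_knn(1), make_knn(3), fit_centroid]

votes = count_votes(data, learners, n_folds=len(data))  # leave-one-out
majority = [i for i, v in enumerate(votes) if v > len(learners) / 2]
consensus = [i for i, v in enumerate(votes) if v == len(learners)]
print(majority, consensus)
```

On this toy data both filters flag the two flipped instances (indices 3 and 5). The majority filter additionally discards the correctly labeled instance at index 4, whose two nearest neighbours are mislabeled, while the consensus filter keeps it. That is the trade-off the abstract describes: majority filters are more aggressive and risk throwing away good data.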
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States > Tennessee > Davidson County > Nashville (0.04)
- North America > United States > New York (0.04)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.68)
- Energy (0.68)
- Education (0.67)
- Government > Regional Government > North America Government > United States Government (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)
Exploiting Multi-Modal Interactions: A Unified Framework
Li, Ming (Nanjing University) | Xue, Xiao-Bing (Nanjing University) | Zhou, Zhi-Hua (Nanjing University)
Given an imagebase with tagged images, four types of tasks can be executed, i.e., content-based image retrieval, image annotation, text-based image retrieval, and query expansion. For any of these tasks the similarity on the concerned type of objects is essential. In this paper, we propose a framework to tackle these four tasks from a unified view. The essence of the framework is to estimate similarities by exploiting the interactions between objects of different modality. Experiments show that the proposed method can improve similarity estimation, and based on the improved similarity estimation, some simple methods can achieve better performance than some state-of-the-art techniques.
- Asia > Middle East > Jordan (0.05)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- North America > United States > Massachusetts (0.04)
- Europe > United Kingdom > England > Berkshire > Wokingham (0.04)